Getting Started Guide
Obfusware experience: Beginner
Requirements:
A Windows, macOS, or Linux computer with a web browser
Obfusware AG installed in your AWS environment
Approximate time to complete: 20 minutes
Last updated: 16 Aug 2025
Now that Obfusware is installed in AWS, you are ready to create your first Obfusware AWS job and mask a dataset. AWS Glue Studio lets you construct a Glue job visually in just a few steps.
- Go to AWS Glue Studio
https://console.aws.amazon.com/gluestudio/home#/jobs
If you are not already logged in, you may be asked to log in. After you log in, you are taken to the Glue Studio page.
- Create a new Glue job
Click the Visual ETL button in the Create job section of the page. For detailed instructions on building visual ETL jobs, see the AWS documentation.
https://docs.aws.amazon.com/glue/latest/dg/author-job-glue.html
- Set the Job details
Click on the Job details tab.
- Set the Basic property fields to the following values
Field | Default | Value |
---|---|---|
Name | Untitled job | First Obfusware Job |
Description | <empty> | <empty> |
IAM Role | <empty> | ObfuswareGlueRole |
Type | Spark | Spark |
Glue version | Glue 5.0 | Glue 5.0 |
Language | Python 3 | Python 3 |
Worker type | G 1X | G 1X |
Automatically scale the number of workers | unchecked | unchecked |
Requested number of workers | 10 | 2 |
Generate job insights | checked | checked |
Generate lineage events | unchecked | unchecked |
Job bookmark | Disable | Disable |
Job run queuing | unchecked | unchecked |
Flex execution | unchecked | unchecked |
Number of retries | 0 | 0 |
Job timeout (minutes) | 480 | 5 |
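
For readers who would rather script this step, the Value column above maps roughly onto the arguments of the AWS Glue `CreateJob` API. The sketch below only builds the argument dictionary and never contacts AWS. The function name `first_job_arguments`, the `script_location` parameter, and the example path are hypothetical placeholders, and the sketch does not cover the visual graph that Glue Studio stores for a Visual ETL job.

```python
def first_job_arguments(script_location):
    """Build CreateJob keyword arguments that mirror the Value column.

    script_location is a hypothetical S3 path to the job script; it is
    not given anywhere in this guide.
    """
    return {
        "Name": "First Obfusware Job",
        "Role": "ObfuswareGlueRole",
        "Command": {
            "Name": "glueetl",             # Type: Spark
            "PythonVersion": "3",          # Language: Python 3
            "ScriptLocation": script_location,
        },
        "GlueVersion": "5.0",
        "WorkerType": "G.1X",
        "NumberOfWorkers": 2,              # Requested number of workers
        "MaxRetries": 0,                   # Number of retries
        "Timeout": 5,                      # Job timeout, in minutes
        "ExecutionClass": "STANDARD",      # Flex execution unchecked
        "DefaultArguments": {
            "--job-bookmark-option": "job-bookmark-disable",
            "--enable-job-insights": "true",
        },
    }


if __name__ == "__main__":
    # Hypothetical script path, used only to show the shape of the result.
    arguments = first_job_arguments("s3://example-bucket/scripts/first-job.py")
    for key, value in arguments.items():
        print(f"{key}: {value}")
```

With boto3 installed and credentials configured, a dictionary like this could be passed as `boto3.client("glue").create_job(**arguments)`. Following the steps in the console remains the path this guide describes.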
- Save the job details
After setting the appropriate job details, save the job by clicking the Save button in the top right corner of the page.
- Build the job
Select the Visual tab
- Add a source node by clicking the + icon.
- Select the Amazon S3 source.
- Set the Amazon S3 source parameters:
- S3 Source Type: S3 location
- S3 URL: s3://obfusware-381492123456-us-east-1/3.0/resources/sample-data.csv
- Data format: CSV
- Delimiter: Comma (,)
- Select the Obfusware Column Data Transform and set the transform parameters:
- Masker 1: USLastNameMasker
- Column 1: last_name
- Masker 2: USVariableDateMasker
- Column 2: dob
- Masker 3: US555TelephoneMasker
- Column 3: phone1
- Masker 4: EmailMasker
- Column 4: email
- Select the Amazon S3 Target and set the Amazon S3 target parameters:
- Format: CSV
- Compression Type: None
- S3 Target Location: s3://obfusware-381492123456-us-east-1/3.0/output/
- Save the job by clicking the Save button on the top right of the page.
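
Each Masker N / Column N pair above names a column in the source file, so a mistyped column name is an easy mistake to make. The following sketch is an illustration only and is not part of Obfusware: it records the pairs as a Python dictionary and checks them against the header row of sample-data.csv. The function name `missing_columns` is hypothetical.

```python
import csv
import io

# Column name -> masker name, as entered in the transform parameters.
MASKERS = {
    "last_name": "USLastNameMasker",
    "dob": "USVariableDateMasker",
    "phone1": "US555TelephoneMasker",
    "email": "EmailMasker",
}

# Header row of sample-data.csv.
HEADER_LINE = (
    "first_name,last_name,dob,company_name,address,city,county,"
    "state,zip,phone1,phone2,email,web"
)


def missing_columns(maskers, header_line):
    """Return the masked column names that do not appear in the header."""
    header = next(csv.reader(io.StringIO(header_line)))
    return [column for column in maskers if column not in header]


if __name__ == "__main__":
    print(missing_columns(MASKERS, HEADER_LINE))  # prints []
```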
Enabling Obfusware AWS Glue jobs
Now that your First Obfusware Job has been created, there is one more step to complete before you can run it. Obfusware relies on tight integration with AWS Glue. To achieve this integration, Obfusware requires its code, in the form of JAR files and Python files, to be accessible to AWS Glue.
While it is possible to enable an AWS Glue job manually by setting some advanced properties in the Job details, doing so is tedious and error-prone, so Obfusware provides a management tool that enables a job for you.
Obfusware-manager CLI tool
$ obfusware-manager enable-job --help
usage: ObfuswareAWSGlueCLI enable-job [-h] jobs [jobs ...]
positional arguments:
  jobs        Names of existing AWS Glue jobs which will be enabled to execute Obfusware transforms
optional arguments:
  -h, --help  show this help message and exit
This tool is installed on the computer, and under the user account, that was used to install Obfusware. It is located in the bin install directory.
On macOS or Linux, the bin install directory is located at:
$HOME/.obfusware-aws/<version>/bin
On Windows the bin install directory is located at:
%USERPROFILE%\obfusware-aws\<version>\bin
To enable the First Obfusware Job, simply run the command:
<bininstalldir>/obfusware-manager -v enable-job "First Obfusware Job"
You should see the following output from the obfusware-manager command:
Enabling AWS Glue jobs to execute Obfusware transforms...
Enabling AWS Glue job(First Obfusware Job)
Enabling AWS Glue jobs SUCCEEDED
Success
The First Obfusware Job is now ready to run.
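
If you want to call the tool from a script, the bin install directory and the quoting of a job name that contains spaces are the two details most likely to go wrong. The sketch below assembles the command as an argument list using the directory layouts given above. The function name `manager_command` and the `x.y.z` version string are hypothetical placeholders, and the sketch assumes the executable has the same name on Windows.

```python
from pathlib import PurePosixPath, PureWindowsPath


def manager_command(home, version, jobs, windows=False):
    """Return the enable-job command as a list of arguments.

    home is the user's home directory ($HOME or %USERPROFILE%) and
    version is the installed Obfusware version directory name.
    """
    if windows:
        # Windows layout: %USERPROFILE%\obfusware-aws\<version>\bin
        bin_dir = PureWindowsPath(home) / "obfusware-aws" / version / "bin"
    else:
        # macOS or Linux layout: $HOME/.obfusware-aws/<version>/bin
        bin_dir = PurePosixPath(home) / ".obfusware-aws" / version / "bin"
    return [str(bin_dir / "obfusware-manager"), "-v", "enable-job", *jobs]


if __name__ == "__main__":
    # "x.y.z" stands in for the real version directory on your machine.
    print(manager_command("/home/alice", "x.y.z", ["First Obfusware Job"]))
```

Passing the resulting list to `subprocess.run(command, check=True)` keeps the job name as a single argument, so no shell quoting is needed.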
Running the Obfusware job
To run the job, select the Runs tab. Then click the Run button on the top right of the page.
When the job finishes running, after approximately 1 minute 30 seconds to 1 minute 45 seconds, the run status changes to Succeeded.
To see the result of the run, you can compare the original file (sample-data.csv) with the results of the run stored in the S3 output/ folder.
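
The same run can also be started and watched from a script rather than the Runs tab. The sketch below assumes a boto3 Glue client is passed in as `glue` and uses its `start_job_run` and `get_job_run` calls; the function name `run_and_wait` and the polling interval are illustrative choices, not something Obfusware prescribes.

```python
import time

# Job run states after which Glue makes no further progress on the run.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "STOPPED", "TIMEOUT", "ERROR", "EXPIRED"}


def run_and_wait(glue, job_name, poll_seconds=15, sleep=time.sleep):
    """Start a Glue job run and poll until it reaches a terminal state.

    glue is expected to behave like boto3.client("glue"). The sleep
    argument exists so the wait can be replaced in tests.
    """
    run_id = glue.start_job_run(JobName=job_name)["JobRunId"]
    while True:
        response = glue.get_job_run(JobName=job_name, RunId=run_id)
        state = response["JobRun"]["JobRunState"]
        if state in TERMINAL_STATES:
            return state
        sleep(poll_seconds)
```

For example, with boto3 installed and credentials configured, `run_and_wait(boto3.client("glue"), "First Obfusware Job")` would return `"SUCCEEDED"` once the run completes successfully.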
Comparing the results
The location and name of the source file, sample-data.csv, are known in advance. The location of the target file is also known, but its exact name is generated by the job. Therefore, to compare the results you first need to list the contents of the output/ folder to discover the name. To generate a listing, run the command:
$ aws s3 ls s3://obfusware-381492123456-us-east-1/3.0/output/
2025-08-01 10:24:46 0
2025-08-01 10:53:48 47600073 run-1754060013607-part-r-00000
Look for the file with a timestamp that matches the end of the job run. Once you have discovered the name of the result file you can compare the contents of the source file with the contents of the target file.
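
Picking the result file out of the listing can be automated. The sketch below parses text in the format shown in the listing above (date, time, size, name) and returns the most recently written non-empty object. The function names are hypothetical, and the sketch assumes that the newest non-empty entry belongs to the run you just finished.

```python
from datetime import datetime

# Listing text copied from the example output above.
LISTING = """\
2025-08-01 10:24:46 0
2025-08-01 10:53:48 47600073 run-1754060013607-part-r-00000
"""


def parse_s3_ls(listing):
    """Return (timestamp, size, name) tuples for each object line."""
    entries = []
    for line in listing.splitlines():
        parts = line.split(None, 3)
        if len(parts) < 3:
            continue  # skip blank lines and "PRE prefix/" lines
        timestamp = datetime.strptime(f"{parts[0]} {parts[1]}", "%Y-%m-%d %H:%M:%S")
        name = parts[3] if len(parts) == 4 else ""  # folder marker has no name
        entries.append((timestamp, int(parts[2]), name))
    return entries


def newest_result(listing):
    """Return the name of the newest non-empty object, or None."""
    candidates = [e for e in parse_s3_ls(listing) if e[2] and e[1] > 0]
    return max(candidates)[2] if candidates else None


if __name__ == "__main__":
    print(newest_result(LISTING))  # prints run-1754060013607-part-r-00000
```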
On macOS or Linux:
$ aws s3 cp s3://obfusware-381492123456-us-east-1/3.0/resources/sample-data.csv - | head -2
first_name,last_name,dob,company_name,address,city,county,state,zip,phone1,phone2,email,web
James,Butt,4/1/1997,"Benton, John B Jr",6649 N Blue Gum St,New Orleans,Orleans,LA,70116,504-621-8927,504-845-1427,jbutt@gmail.com,http://www.bentonjohnbjr.com
$ aws s3 cp s3://obfusware-381492123456-us-east-1/3.0/output/run-1754060013607-part-r-00000 - | head -2
first_name,last_name,dob,company_name,address,city,county,state,zip,phone1,phone2,email,web
James,Ketchersid,5/1/1997,"Benton, John B Jr","6649 N Blue Gum St","New Orleans",Orleans,LA,70116,504-555-7562,504-845-1427,freddy6791@example.com,"http://www.bentonjohnbjr.com"
On Windows:
> aws s3 cp s3://obfusware-381492123456-us-east-1/3.0/resources/sample-data.csv - | more
first_name,last_name,dob,company_name,address,city,county,state,zip,phone1,phone2,email,web
James,Butt,4/1/1997,"Benton, John B Jr",6649 N Blue Gum St,New Orleans,Orleans,LA,70116,504-621-8927,504-845-1427,jbutt@gmail.com,http://www.bentonjohnbjr.com
...
> aws s3 cp s3://obfusware-381492123456-us-east-1/3.0/output/run-1754060013607-part-r-00000 - | more
first_name,last_name,dob,company_name,address,city,county,state,zip,phone1,phone2,email,web
James,Ketchersid,5/1/1997,"Benton, John B Jr","6649 N Blue Gum St","New Orleans",Orleans,LA,70116,504-555-7562,504-845-1427,freddy6791@example.com,http://www.bentonjohnbjr.com
...
By comparing the source fields (last_name, dob, phone1, email) to the corresponding target fields, you can see the results of masking the selected fields.
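
Comparing by eye gets harder because the output file quotes some fields that the source file does not. A minimal sketch of a column-by-column comparison follows, using the header and the first data row from each file as shown above. It parses each line with Python's csv module, so quoting differences are ignored; the function name `changed_columns` is hypothetical.

```python
import csv
import io

HEADER = (
    "first_name,last_name,dob,company_name,address,city,county,"
    "state,zip,phone1,phone2,email,web"
)
SOURCE_ROW = (
    'James,Butt,4/1/1997,"Benton, John B Jr",6649 N Blue Gum St,New Orleans,'
    "Orleans,LA,70116,504-621-8927,504-845-1427,jbutt@gmail.com,"
    "http://www.bentonjohnbjr.com"
)
TARGET_ROW = (
    'James,Ketchersid,5/1/1997,"Benton, John B Jr","6649 N Blue Gum St",'
    '"New Orleans",Orleans,LA,70116,504-555-7562,504-845-1427,'
    'freddy6791@example.com,"http://www.bentonjohnbjr.com"'
)


def changed_columns(header_line, source_line, target_line):
    """Return the names of the columns whose values differ between rows."""
    header, source, target = (
        next(csv.reader(io.StringIO(line)))
        for line in (header_line, source_line, target_line)
    )
    return [
        name
        for name, before, after in zip(header, source, target)
        if before != after
    ]


if __name__ == "__main__":
    print(changed_columns(HEADER, SOURCE_ROW, TARGET_ROW))
    # prints ['last_name', 'dob', 'phone1', 'email']
```

Only the four masked columns are reported, which matches the maskers configured in the job.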